An openedx_catalog app with a representation for CourseRuns [FC-0117]#479
bradenmacdonald wants to merge 29 commits into main
I don't know if we'll force it to be stable, but we'll probably want a
Historical note: Apparently there was an old PR to add an opaque key for catalog courses, but it never merged: openedx/opaque-keys#87. Seems like they were called "aggregate courses" then.
kdmccormick left a comment:
Partial review as I'm signing off for tonight, but overall really solid. I appreciate the comments and validation a lot. My only concerns so far are some superficial naming stuff.
    help_text=_("The internal database ID for this course. Should not be exposed to users nor in APIs."),
    editable=False,
)
course_id = CourseKeyField(
Suggested change:
- course_id = CourseKeyField(
+ course_key = CourseKeyField(
Could we call this course_key? In Django, *_id is usually a foreign key, and I think it's unfortunate we've used course_id in so many places. I'd love to standardize on *_key for anything that's an OpaqueKey instance.
Yeah, I should have done that in the first place. Done: 88aef75
# Enforce that the course ID must end with "+run" where "run" is an exact match for the "run" field.
# This check may be removed or changed in the future if our course ID format ever changes
models.CheckConstraint(
    # Note: EndsWith() on SQLite is always case-insensitive, so we code the constraint like this:
    condition=Exact(Right("course_id", Length("run") + 1), Concat(models.Value("+"), "run")),
    name="oex_catalog_courserun_courseid_run_match_exactly",
    violation_error_message=_("The CourseRun 'run' field should match the run in the course_id key."),
),
I have a sense that we'll eventually want to relax this in order to allow sites more flexibility on how they market their course runs, but I agree with adding the constraint for now and seeing how it plays out.
Turns out that I already had to relax it a bit, because it was breaking on CCX keys like ccx-v1:org+code+run+ccx@1 which don't end with +run. In fact, I realized CCX keys break a few different assumptions here - they also violate the constraint that (org, code, run) is unique per course run, because all CCX variants of a run have the same base course ID.
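For reference, here is a minimal pure-Python sketch of what the original suffix check does and why CCX keys fail it. The function name `run_matches_course_key` is hypothetical and is not part of this PR; it only mirrors the `Right(...)` / `Concat(...)` expression on plain strings, and it assumes the constraint is meant to compare case-sensitively.

```python
def run_matches_course_key(course_key: str, run: str) -> bool:
    """Return True if course_key ends with "+" followed by run (case-sensitive).

    Hypothetical helper: mirrors Right(course_id, Length(run) + 1) == "+" + run
    on plain strings, without touching the database.
    """
    suffix_length = len(run) + 1  # the run plus the leading "+"
    # Python's == on strings is case-sensitive, which is the intent of the constraint.
    return course_key[-suffix_length:] == "+" + run


print(run_matches_course_key("course-v1:org+code+run", "run"))     # True
print(run_matches_course_key("course-v1:org+code+RUN", "run"))     # False
print(run_matches_course_key("ccx-v1:org+code+run+ccx@1", "run"))  # False
```

The last call shows the CCX problem: the key ends with `+ccx@1`, so a plain suffix comparison against the base run can never pass.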
@bradenmacdonald I'm rusty on the CCX data model. Is every CCX considered a "course run" by the existing system? I know a CCX is a LearningContext, but does every CCX have a CourseOverview row?
Yes, every CCX instance is seen as a fully separate course run by most parts of the system (except Studio, which doesn't list them nor interact with them at all; CCX runs are created/edited/managed strictly through the LMS).
I just tested this now; here are the CourseOverviews of a CCX base course plus a CCX run created from it:

    return  # It's a brand new Organization; we don't care

prev_org_code = Organization.objects.get(pk=instance.pk).short_name
new_org_code = instance.short_name
The logic below looks solid, but I'm curious what it handles that this simpler logic wouldn't handle? ⬇️
    if prev_org_code.lower() != new_org_code.lower():
        # If there are any runs, then changing the org code (other than capitalization) is forbidden.
        if CourseRun.objects.filter(catalog_course__org=instance).exists():
            raise ValidationError(...)
Is it so that if a CourseRun's course_key changes, then the org table can be updated to match? If so, do you mind dropping that in the comments?
I think the only such situation is this:
- An Organization "MITx" is renamed to "MIT" -> meanwhile all the associated course runs have keys like course-v1:MIT+foo+bar. Don't throw an error, because this is now "more correct".
This situation shouldn't really be possible if you are using these models correctly, but the "org.short_name matches course_key.org" rule is not actually enforced by the database, because we can't write constraints across table boundaries, and it involves parsing an opaque key. So it is possible if you are using the .update() manager API or raw/custom SQL, or anything else that bypasses the checks in course_run.clean()
The other advantage is that it states an exact, example course key in the ValidationError.
That said, I don't think the situation described above is likely to occur so I'd be fine with simplifying this to your suggestion.
Thinking about it more, I think I like what you have here. Generally speaking, it seems good to allow data changes towards correctness, rather than forbidding changes entirely. I can think of times where I've been burned by validation whose intent was to keep my data correct, but which also effectively locked incorrect data into remaining incorrect. I'd say just drop a quick comment in to explain that, and keep it as-is.
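To make the "allow changes towards correctness" rule concrete, here is a rough pure-Python sketch of the behaviour described in this thread. It is not the code from the PR: the function name `check_org_code_change`, the `(course_key, org)` tuple input, the case-insensitive comparison, and the use of `ValueError` in place of Django's `ValidationError` are all assumptions made for illustration.

```python
def check_org_code_change(
    prev_code: str,
    new_code: str,
    runs: list[tuple[str, str]],
) -> None:
    """Raise ValueError if renaming the org would contradict existing course runs.

    runs: (course_key, org_from_key) pairs for every run of this organization.
    Hypothetical input shape; the real code would parse the opaque key instead.
    """
    if prev_code.lower() == new_code.lower():
        return  # Only capitalization changed (or nothing did): always allowed.
    for course_key, key_org in runs:
        # A rename is only acceptable if every run's key already uses the new code,
        # i.e. the change moves the data towards correctness.
        if key_org.lower() != new_code.lower():
            raise ValueError(
                f"Cannot rename organization {prev_code!r} to {new_code!r}: "
                f"course run {course_key} still uses org {key_org!r}."
            )


# "MITx" -> "MIT" is fine when the runs' keys already say "MIT".
check_org_code_change("MITx", "MIT", [("course-v1:MIT+foo+bar", "MIT")])
```

Compared with the simpler "forbid any rename when runs exist" check, this version lets a mismatched organization be corrected, and the error message names one concrete offending course key.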
kdmccormick left a comment:
Just a few requests for more words, looks great otherwise. I'll take a look at the platform PR soon.
    return new_run


def delete_course_run(course_key: CourseKey) -> None:
Could you add more to this docstring? What is and isn't deleted by this function? And will that change when CourseRun becomes authoritative?
Sure: a27856f
Let me know if you have any thoughts on that, as I wasn't totally sure how it should behave or how we see it evolving.
# Note: display_name should never be blank. But we previously didn't store a name for catalog courses in the core.
# For backfilling, if there is only one run, we use that run's name as the catalog course name. Otherwise, we can
# use the org + course code as the display name.
display_name = case_insensitive_char_field(
Doesn't openedx_content use title instead of display_name? Would it make sense to use title here and in CourseRun too?
If that's what we're moving toward, sure. I don't have any preference.
I'm wondering if we need a little guide of term preferences somewhere. vertical -> unit, sequential -> subsection, chapter -> section, course_id -> course_key, number -> course_code, display_name -> title, etc. There are a lot.
I should also mention that OLX uses display_name at every level, so we probably can't really change that completely. I do think title is better though.
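Whichever name the field ends up with, the backfill rule from the code comment at the top of this thread could be sketched roughly like this. The function name `backfill_catalog_course_name`, the space-separated "org + course code" format, and the fallback for a blank run name are assumptions for illustration, not the PR's actual implementation.

```python
def backfill_catalog_course_name(org_code: str, course_code: str, run_names: list[str]) -> str:
    """Pick a display name for a catalog course that never had one stored.

    Hypothetical helper illustrating the backfill rule from the code comment:
    a single run lends its name to the catalog course; otherwise fall back to
    the org code plus the course code.
    """
    if len(run_names) == 1 and run_names[0].strip():
        return run_names[0]
    # Assumption: the fallback joins the two codes with a space.
    return f"{org_code} {course_code}"


print(backfill_catalog_course_name("MITx", "6.002x", ["Circuits and Electronics"]))
print(backfill_catalog_course_name("MITx", "6.002x", ["Spring run", "Fall run"]))
```

Either branch returns a non-blank string as long as the codes themselves are non-blank, which matches the note that the display name should never be blank.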
Co-authored-by: Kyle McCormick <kyle@axim.org>
An implementation of #469.
Related platform PR: openedx/openedx-platform#38023
Notes
- […] edx-organizations (which also requires Pillow for the logo field)
- […] opaque-keys, which was already an indirect dependency but not used directly
- […] display_name, and catalog courses do not have names nor exist in the core platform at all. I'm proposing we add a display_name field to the new core CatalogCourse model to support various use cases, including the proposed new studio home page. See the code for how this can be backfilled and how runs can always override the name for each run.
- […] CatalogCourse/CourseRun objects; just a minimal API that platform code can use to keep them in sync with CourseOverview.
- […] course_overviews app that syncs data from modulestore -> CourseOverview -> openedx_catalog. It would be more robust and future-friendly to instead sync data directly from SplitModulestoreCourseIndex -> openedx_catalog, but it's harder to get information like display_name and language in the latter case as that's not available in the SplitModulestoreCourseIndex table. It could be retrieved from Mongo though.
- […] CourseOverview based on the course_published signal, any test cases in platform that want to use these models have to be sure to enable that signal, which is disabled for test by default.
- […] CourseRun when CourseOverview is updated, and CourseOverview itself is updated after courses are already created, it's possible that an error will occur and the CourseRun will never get created or updated. This will result in an error in the logs, but will not block course creation etc., so the error is likely to go unnoticed with the system as it exists today.
Architecture Diagram
See ARCHITECTURE.md.
Questions
- […] org_code and course_code good terms to use? Should I call the latter number instead, like other parts of the code do? Should I call the org part org_short_name?
- […] url_code for each CatalogCourse useful? Do we want to make it editable now, or in the future?
Can be addressed later:
- […] CourseRun a SoftDeletableModel to support course deletion without data loss? (Maybe we can't now, because soft-deleting it in that one table wouldn't affect the other tables that the system actually references. But in the future we could add this.)
- […] openedx_content
- […] catalog_visibility, visible_to_staff_only, and course_visibility which all have different effects and different enum values, and also need to support "use system default")
- […] OrganizationCourse